Abstract — Distributed adaptive filtering has been considered 
as an effective approach for data processing and estimation over 
distributed networks. Most existing distributed adaptive filtering 
algorithms focus on designing different information diffusion 
rules, regardless of the natural evolutionary characteristics of a 
distributed network. In this paper, we study the adaptive network 
from the game theoretic perspective and formulate the distributed 
adaptive filtering problem as a graphical evolutionary game. 
With the proposed formulation, the nodes in the network are 
regarded as players, and the local combination of estimation 
information from different neighbors is regarded as the selection 
of different strategies. We show that this graphical evolutionary 
game framework is very general and can unify the existing 
adaptive network algorithms. Based on the framework, we 
further propose an error-aware adaptive filtering algorithm 
which does not depend on any network statistical information. 
Moreover, we use graphical evolutionary game theory to analyze 
the information diffusion process over the adaptive networks and 
evolutionarily stable strategy of the system. Finally, simulation 
results are shown to verify the effectiveness of our analysis and 
proposed method. 

Index Terms — Adaptive filtering, graphical evolutionary game, 
distributed estimation, adaptive networks, data diffusion. 

I. Introduction 

Recently, the concept of the adaptive filter network, derived 
from traditional adaptive filtering, has emerged, where 
a group of nodes cooperatively estimate some parameters of 
interest from noisy measurements [1]. Such a distributed 
estimation architecture can be applied to many scenarios, such as 
wireless sensor networks for environment monitoring, wireless 
ad hoc networks for military event localization, and distributed 
cooperative sensing in cognitive radio networks 
[2], [3]. Compared with the classical centralized architecture, 
the distributed one is not only more robust when the center 
node may be dysfunctional, but also more flexible when the 
nodes are mobile. Therefore, the distributed adaptive filter 
network has been considered an effective approach for the 
implementation of data fusion, diffusion and processing over 
distributed networks [4]. 

In a distributed adaptive filter network, at every time instant 
$t$, node $i$ receives a set of data $\{d_i(t), u_i^t\}$ that satisfies a 
linear regression model as follows: 

$$ d_i(t) = u_i^t w^o + v_i(t), \tag{1} $$

where $w^o$ is a deterministic but unknown $M \times 1$ vector, $d_i(t)$ 
is a scalar measurement of some random process $d_i$, $u_i^t$ is 
the $1 \times M$ regression vector at time $t$ with zero mean and 
covariance matrix $R_{u_i} = E(u_i^{t*} u_i^t) > 0$, and $v_i(t)$ is the 
random noise signal at time $t$ with zero mean and variance 
$\sigma_i^2$. Note that the regression data $u_i^t$ and measurement process 
$d_i$ are temporally white and spatially independent, both 
individually and mutually. The objective for each node is to use 
the data set $\{d_i(t), u_i^t\}$ to estimate the parameter $w^o$. 

In the literature, many distributed adaptive filtering algorithms 
have been proposed for the estimation of the parameter $w^o$. 
The incremental algorithms, in which node $i$ updates $w$ by 
combining the observed data sets of itself and node $i - 1$, were 
proposed, e.g., the incremental LMS algorithm [5]. Unlike the 
incremental algorithms, the diffusion algorithms allow node $i$ to 
combine the data sets from all neighbors, e.g., diffusion LMS 
[6], [7] and diffusion RLS [8]. Besides, the projection-based 
adaptive filtering algorithms were summarized in [9], e.g., the 
projected subgradient algorithm [10] and the combine-project-adapt 
algorithm [11]. In [12], the authors considered the nodes' 
mobility and analyzed mobile adaptive networks. 

While achieving promising performance, these traditional 
distributed adaptive filtering algorithms mainly focused on 
designing different information combination rules or diffusion 
rules among the neighborhood. However, on one hand, in a 
distributed network, forcing all the nodes to follow some 
predefined rules may be impractical. Instead, the activities of 
nodes probably follow some natural evolutionary rules rather than 
artificially designed rules. On the other hand, although 
various kinds of combination rules have been developed, 
there is no general framework that can reveal the unified 
fundamentals of distributed adaptive filtering problem. In this 
paper, we will use the evolutionary game theory to formulate 
the distributed adaptive filtering problem and propose a general 
framework that can unify the existing algorithms. 

The main contributions of this paper are summarized as 
follows. 

1) We propose a graphical evolutionary game theoretic framework 
for distributed adaptive networks, where nodes 
in the network are regarded as players and the local 
combination of estimation information from different 
neighbors is regarded as the selection of different strategies. 
Such a framework is general enough to unify existing 
adaptive filtering algorithms as special cases. 

2) Based on the proposed game theoretic framework, we 
further propose an error-aware distributed adaptive filtering 
algorithm. While achieving better mean-square 
performance than existing adaptive filtering algorithms, 
the proposed algorithm does not depend on any network 
topology and statistical information. 

3) Using graphical evolutionary game theory, we analyze 
the information diffusion process over the adaptive 
network, and derive the diffusion probability of 
information from good nodes. 

4) We prove that the strategy of using information from 
good nodes is an evolutionarily stable strategy in both 
complete graphs and incomplete graphs. 

TABLE I 
Different Combination Rules. 

Fig. 1. Left: centralized model. Right: distributed model. 

The rest of this paper is organized as follows. We summarize 
the existing works in Section II. In Section III, we describe 
in detail how to formulate the distributed adaptive filtering 
problem as a graphical evolutionary game. We then discuss 
the information diffusion process over the adaptive network 
in Section IV, and further analyze the evolutionarily stable 
strategy in Section V. Simulation results are shown in Section 
VI. Finally, we draw conclusions in Section VII. 

II. Related Works 

Let us consider an adaptive filter network with $N$ nodes. 
If there is a fusion center that can collect information from 
all nodes, then global (centralized) optimization methods can 
be used to derive the optimal updating rule for the parameter 
$w$, where $w$ is a deterministic but unknown $M \times 1$ vector for 
estimation, as shown in the left part of Fig. 1. For example, 
in the global LMS algorithm, the parameter updating rule can 
be written as [6] 

$$ w^{t+1} = w^t + \mu \sum_{i=1}^{N} u_i^{t*} \left( d_i(t) - u_i^t w^t \right), \tag{2} $$

where $\mu$ is the step size. From (2), we can see that the centralized 
LMS algorithm requires the information $\{d_i(t), u_i^t\}$ 
across the whole network, which is generally impractical. 
Moreover, such a centralized architecture relies heavily on the 
fusion center and will collapse when the fusion center is 
dysfunctional or some data links are disconnected. 
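
The recursion (2) can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: real-valued data, white Gaussian regressors with identity covariance, and hypothetical names (`global_lms`, `noise_std`) that do not come from the paper.

```python
import random

def global_lms(w_true, noise_std, mu=0.01, steps=2000, seed=0):
    """Simulate model (1) at N nodes and run the global LMS recursion (2).

    Assumptions (not from the paper): real-valued data and white Gaussian
    regressors with unit variance. noise_std[i] is the noise standard
    deviation of node i.
    """
    rng = random.Random(seed)
    M = len(w_true)
    w = [0.0] * M
    for _ in range(steps):
        grad = [0.0] * M
        for sigma in noise_std:                          # one pass over the N nodes
            u = [rng.gauss(0.0, 1.0) for _ in range(M)]  # regressor u_i^t
            d = sum(a * b for a, b in zip(u, w_true)) + rng.gauss(0.0, sigma)
            err = d - sum(a * b for a, b in zip(u, w))   # d_i(t) - u_i^t w^t
            for k in range(M):
                grad[k] += u[k] * err
        # The fusion center needs every node's data to form this sum.
        w = [w[k] + mu * grad[k] for k in range(M)]
    return w

w_true = [1.0, -0.5, 0.25]
w_hat = global_lms(w_true, noise_std=[0.1] * 10)
```

The inner loop makes the practical objection visible: every update touches the raw data of all $N$ nodes.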

If there is no fusion center in the network, then each node 
needs to exchange information with its neighbors to update 
the parameter, as shown in the right part of Fig. 1. In the 
literature, several distributed adaptive filtering algorithms have 
been introduced, such as distributed incremental algorithms 
[5], distributed LMS [6], [7], and projection-based algorithms 
[10], [11]. These distributed algorithms are based on the classical 
adaptive filtering algorithms, where the difference is that 
nodes can use information from neighbors to estimate the 
parameter $w^o$. Taking one of the distributed LMS algorithms, 
Adapt-then-Combine Diffusion LMS (ATC) [6], as an 
example, the parameter updating rule for node $i$ is 

$$ \begin{cases} \chi_i^{t+1} = w_i^t + \mu_i \sum_{j \in \mathcal{N}_i} o_{ij}\, u_j^{t*} \left( d_j(t) - u_j^t w_i^t \right), \\ w_i^{t+1} = \sum_{j \in \mathcal{N}_i} c_{ij}\, \chi_j^{t+1}, \end{cases} \tag{3} $$

where $\mathcal{N}_i$ denotes the neighboring nodes set of node $i$ (including 
node $i$ itself), and $c_{ij}$ and $o_{ij}$ are linear weights satisfying the 
following conditions: 

$$ c_{ij} = o_{ij} = 0 \ \text{ if } j \notin \mathcal{N}_i, \qquad \sum_{j=1}^{N} c_{ij} = 1, \qquad \sum_{j=1}^{N} o_{ij} = 1. \tag{4} $$



In a practical scenario, since the exchange of full raw 
data $\{d_i(t), u_i^t\}$ among neighbors is costly, the weight $o_{ij}$ 
is usually set as $o_{ij} = 0$ if $j \neq i$, as in [6]. In such a 
case, for node $i$ with degree $n_i$ (including node $i$ itself, i.e., 
the cardinality of set $\mathcal{N}_i$), we can write the general parameter 
updating rule as 

$$ w_i^{t+1} = A_i\left( F(w_1^t), F(w_2^t), \ldots, F(w_{n_i}^t) \right) = \sum_{j \in \mathcal{N}_i} A_i(j)\, F(w_j^t), \tag{5} $$

where $F(\cdot)$ can be any adaptive filtering algorithm, e.g., 
$F(w_i^t) = w_i^t + \mu u_i^{t*} (d_i(t) - u_i^t w_i^t)$ for the LMS algorithm, and 
$A_i(\cdot)$ represents some specific linear combination rule. Eqn. 
(5) gives a general form of existing distributed adaptive 
filtering algorithms, where the combination rule $A_i(\cdot)$ mainly 
determines the performance. Table I summarizes the existing 
combination rules, where for all rules $A_i(j) = 0$ if $j \notin \mathcal{N}_i$. 
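
One step of the general rule (5) can be sketched as a local LMS update followed by a weighted combination. The sketch assumes real-valued data; the function names and the dictionary layout for the weights $A_i(j)$ are hypothetical choices made for illustration.

```python
def lms_update(w, u, d, mu):
    """F(w) = w + mu * u^T (d - u w), for real-valued data."""
    err = d - sum(a * b for a, b in zip(u, w))
    return [wk + mu * uk * err for wk, uk in zip(w, u)]

def diffusion_step(estimates, data, weights, mu):
    """One step of (5): adapt locally with F, then combine with A_i(j).

    estimates[j] is w_j^t, data[j] = (u_j^t, d_j(t)), and weights[i] maps
    neighbor index j -> A_i(j). Each weights[i] is assumed to sum to one
    and to contain only members of the neighborhood of node i.
    """
    adapted = [lms_update(w, u, d, mu) for w, (u, d) in zip(estimates, data)]
    M = len(estimates[0])
    return [
        [sum(a * adapted[j][k] for j, a in weights[i].items()) for k in range(M)]
        for i in range(len(estimates))
    ]

# Three fully connected nodes that already hold the true parameter and
# observe noise-free data: the step should leave the estimates unchanged.
estimates = [[1.0, 2.0], [1.0, 2.0], [1.0, 2.0]]
data = [([1.0, 0.0], 1.0), ([0.0, 1.0], 2.0), ([1.0, 1.0], 3.0)]
uniform = {0: 1 / 3, 1: 1 / 3, 2: 1 / 3}
updated = diffusion_step(estimates, data, [uniform] * 3, mu=0.1)
```

Only the intermediate estimates $F(w_j^t)$ cross the links here, which matches the case $o_{ij} = 0$ for $j \neq i$ discussed above.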

From Table I, we can see that the weights of the first four 
combination rules are purely based on the network topology. 
The disadvantage of such topology-based rules is that they are 
sensitive to the spatial variation of signal and noise statistics 
across the network. The relative degree-variance rule shows 
better mean-square performance than the others; however, it 
requires knowledge of all neighbors' noise variances. As 
discussed in Section I, all these distributed algorithms 
focus only on designing the combination rules. Nevertheless, a 
distributed network is just like a natural ecological system and 
the nodes are just like individuals in the system, which may 
spontaneously follow some natural evolutionary rules instead 
of some specific artificially predefined rules. Besides, although 
various kinds of combination rules have been developed, 
there is no general framework which can reveal the unifying 
fundamentals of distributed adaptive filtering problems. In the 
sequel, we will use graphical evolutionary game theory to 
establish a general framework that unifies existing algorithms and 
gives insights into the distributed adaptive filtering problem. 

III. Graphical Evolutionary Game Formulation 
A. Introduction of Graphical Evolutionary Game 

Evolutionary game theory (EGT) originated from the 
study of ecological biology [17]. It was first adopted to study 
gene evolution, where the genes whose strategies are more 
successful will have higher fitness and a higher probability of 
being reproduced. In such a case, the population fractions of 
genes whose fitness is higher than the average level of the 
whole population will tend to grow at a faster rate. Since 
genes with lower-fitness strategies are gradually eliminated 
during the dynamic process, the stable steady state after the 
evolution converges must be a Nash equilibrium. Later on, 
EGT has been widely used to model users' behaviors in image 
processing [18], as well as in communication and networking 
[19], [20], such as congestion control [21], cooperative 
sensing [22], cooperative peer-to-peer (P2P) streaming [23] 
and dynamic spectrum access [24]. In these works, the 
evolutionary game has been shown to be an effective approach 
to model the dynamic social interactions among users in a 
network. In EGT, there are two important concepts: replicator 
dynamics and evolutionarily stable strategy [25]. 

1) Replicator Dynamics: EGT differs from classical 
game theory by placing more emphasis on the dynamics and 
stability of the whole population's strategies, instead of only 
the property of the equilibrium. In a distributed environment, 
all players are uncertain about other players' actions and 
utilities. To improve his/her own utility, each player will try 
different strategies in different rounds of play and learn from 
the interactions using the methodology of understanding-by- 
building. During this process, the proportion of players using 
a certain pure strategy may vary with time. In EGT, replicator 
dynamics are used to model such a population evolution. 

Let us consider an evolutionary game with $m$ strategies 
$\mathcal{X} = \{1, 2, \ldots, m\}$. The payoff matrix, $U$, is an $m \times m$ matrix, 
whose entries, $u_{ij}$, denote the payoff for strategy $i$ versus 
strategy $j$. The population fraction of strategy $i$ is given by 
$p_i$, where $\sum_{i=1}^{m} p_i = 1$. The fitness of strategy $i$ is given 
by $f_i = \sum_{j=1}^{m} p_j u_{ij}$. For the average fitness of the whole 
population, we have $\phi = \sum_{i=1}^{m} p_i f_i$. Thus, the replicator 
dynamic equation is given by [25] 

$$ \dot{p}_i = \eta\, p_i (f_i - \phi), \tag{6} $$

where $\eta$ is a positive scale factor. From (6), we can see that 
if playing strategy $i$ can lead to a higher fitness than the 
average level, the population fraction $p_i$ will increase and the 
increasing rate $\dot{p}_i / p_i$ is proportional to the difference between 
$f_i$ and $\phi$. By setting $\dot{p}_i = 0$ in (6), the theoretical ESS of 
the game can be calculated by solving the equation. The 
Wright-Fisher model has been widely adopted to let a group of 
players converge to the ESS [26], where the strategy updating 
equation for each player can be written as 

$$ p_i(t+1) = \frac{p_i(t) f_i(t)}{\sum_{k=1}^{m} p_k(t) f_k(t)}. \tag{7} $$

From (7), it can be seen that the strategy updating process 
in the evolutionary game is similar to the parameter updating 
process in the adaptive filter problem. It is therefore natural to 
use the evolutionary game to formulate the distributed adaptive 
filter problem. 
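
The Wright-Fisher update can be illustrated with a small sketch. It assumes the standard normalized form in which each fraction is scaled by its fitness and divided by the average fitness, with strictly positive payoffs; the function name is hypothetical.

```python
def wright_fisher_step(p, U):
    """One Wright-Fisher update with fitness f_i = sum_j p_j * u_ij.

    p is the list of population fractions and U the payoff matrix (list of
    rows). Payoffs are assumed strictly positive so that the average
    fitness phi is nonzero.
    """
    m = len(p)
    f = [sum(p[j] * U[i][j] for j in range(m)) for i in range(m)]
    phi = sum(p[i] * f[i] for i in range(m))   # average fitness of the population
    return [p[i] * f[i] / phi for i in range(m)]

# Strategy 2 always earns twice as much as strategy 1, so its share grows.
p_next = wright_fisher_step([0.5, 0.5], [[1.0, 1.0], [2.0, 2.0]])
```

Starting from equal shares, the fitness values are 1 and 2 and the average is 1.5, so the shares move to 1/3 and 2/3. The fraction with above-average fitness grows, exactly as the replicator equation (6) predicts.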

2) Evolutionarily Stable Strategy: EGT is an effective 
approach to study how a group of players converges to a stable 
equilibrium after a period of strategic interactions. Such an 
equilibrium strategy is defined as the Evolutionarily Stable 
Strategy (ESS). For an evolutionary game with $N$ players, a 
strategy profile $a^* = (a_1^*, \ldots, a_N^*)$, where $a_i^* \in \mathcal{X}$, is an ESS 
if and only if, $\forall a \neq a^*$, $a^*$ satisfies the following [25]: 

1) $U_i(a_i, a_{-i}^*) \leq U_i(a_i^*, a_{-i}^*)$, (8) 

2) if $U_i(a_i, a_{-i}^*) = U_i(a_i^*, a_{-i}^*)$, then 
$U_i(a_i, a_{-i}) < U_i(a_i^*, a_{-i})$, (9) 

where $U_i$ stands for the utility of player $i$ and $a_{-i}$ denotes the 
strategies of all players other than player $i$. We can see that 
the first condition is the Nash equilibrium (NE) condition, and 
the second condition guarantees the stability of the strategy. 
Moreover, we can also see that a strict NE is always an ESS. 
If all players adopt the ESS, then no mutant strategy could 
invade the population under the influence of natural selection. 
Even if a small part of players may not be rational and take 
out-of-equilibrium strategies, ESS is still a locally stable state. 

3) Graphical Evolutionary Game: Classical evolutionary 
game theory considers a population of $M$ individuals 
in a complete graph. However, in many scenarios, players' 
spatial locations may lead to an incomplete graph structure. 
Graphical evolutionary game theory was introduced to study the 
evolution of strategies in such a finite structured population [27], 
where each vertex represents a player and each edge represents 
the reproductive relationship between valid neighbors, i.e., $e_{ij}$ 
denotes the probability that the strategy of node $i$ will replace 
that of node $j$, as shown in Fig. 2. Graphical EGT focuses on 
analyzing the ability of a mutant gene to overtake a group of 
finite structured residents. One of the most important research 
issues in graphical EGT is how to compute the fixation 
probability, i.e., the probability that the mutant will eventually 
overtake the whole structured population [28]. In this paper, 
we will use graphical EGT to formulate the dynamic parameter 
updating process in a distributed adaptive filter network. 

Fig. 2. Graphical evolutionary game model. 

Fig. 3. Three different update rules, where death selections are shown in dark blue and birth selections are shown in red. (a) BD update rule. (b) DB update rule. (c) IM update rule (stages: initial population, selection for update, selection for imitation, imitation). 

B. Graphical Evolutionary Game Formulation 

In graphical EGT, each player updates its strategy according 
to his/her fitness after interacting with neighbors in each 
round. Similarly, in distributed adaptive filtering, each node 
updates its parameter $w$ by incorporating the neighbors' 
information. In such a case, we can treat the nodes in a 
distributed filter network as players in a graphical evolutionary 
game. Node $i$ with $n_i$ neighbors has $n_i$ strategies 
$\{1, 2, \ldots, n_i\}$, where strategy $j$ means updating $w_i^{t+1}$ using 
the updated information from its neighbor $j$, $F(w_j^t)$. We can 
see that (5) represents the adoption of a mixed strategy. In such 
a case, the parameter updating in a distributed adaptive filter 
network can be regarded as the strategy updating in graphical 
EGT. 

We first discuss how players' strategies are updated in 
graphical EGT, which is then applied to the parameter updating 
in distributed adaptive filtering. In graphical EGT, the fitness 



of a player is locally determined from interactions with all 
adjacent players, which is defined as [29] 

$$ f = (1 - \alpha) \cdot B + \alpha \cdot U, \tag{10} $$

where $B$ is the baseline fitness and $U$ is the player's payoff, 
which is determined by the predefined payoff matrix. The 
parameter $\alpha$ represents the selection intensity, i.e., the relative 
contribution of the game to fitness. The case $\alpha \to 0$ represents 
the limit of weak selection [30], while $\alpha \to 1$ denotes strong 
selection, where fitness equals payoff. There are three different 
strategy updating rules for the evolution dynamics, called 
birth-death (BD), death-birth (DB) and imitation (IM) [31]. 

• BD update rule: a player is chosen for reproduction 
with probability proportional to fitness (Birth 
process). Then, the chosen player's strategy replaces the 
strategy of one neighbor chosen uniformly (Death process), as 
shown in Fig. 3(a). 

• DB update rule: a random player is chosen to abandon 
his/her current strategy (Death process). Then, the chosen 
player adopts one of his/her neighbors' strategies with 
probability proportional to their fitness (Birth 
process), as shown in Fig. 3(b). 

• IM update rule: each player either adopts the strategy 
of one neighbor or remains with his/her current strategy, 
with probability proportional to fitness, as 
shown in Fig. 3(c). 

These three kinds of strategy updating rules can be matched 
to three different kinds of parameter updating algorithms in 
distributed adaptive filtering. Suppose that there are $N$ nodes 
in a structured network, where the degree of node $i$ is $n_i$. We 
use $\mathcal{N}$ to denote the set of all nodes and $\mathcal{N}_i$ to denote the 
neighborhood set of node $i$, including node $i$ itself. 

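For the BD update rule, the probability that node $i$ adopts 
strategy $j$, i.e., using updated information from its neighbor 
node $j$, is 

$$ P_j = \frac{f_j}{\sum_{k \in \mathcal{N}} f_k} \cdot \frac{1}{n_j}, \tag{11} $$

where the first term $f_j / \sum_{k \in \mathcal{N}} f_k$ is the probability that the 
neighboring node $j$ is chosen for reproduction, which is proportional 
to its fitness $f_j$, and the second term $1/n_j$ is the probability that 
node $i$ is chosen for adopting strategy $j$. In such a case, the 
equivalent parameter updating rule for node $i$ can be written 
as 

$$ w_i^{t+1} = \sum_{j \in \mathcal{N}_i \setminus \{i\}} \frac{f_j}{\sum_{k \in \mathcal{N}} f_k} \frac{1}{n_j} F(w_j^t) + \left( 1 - \sum_{j \in \mathcal{N}_i \setminus \{i\}} \frac{f_j}{\sum_{k \in \mathcal{N}} f_k} \frac{1}{n_j} \right) F(w_i^t). \tag{12} $$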

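Similarly, for the DB updating rule, we can obtain the corresponding 
parameter updating rule for node $i$ as 

$$ w_i^{t+1} = \frac{1}{n_i} \sum_{j \in \mathcal{N}_i \setminus \{i\}} \frac{f_j}{\sum_{k \in \mathcal{N}_i} f_k} F(w_j^t) + \left( 1 - \frac{1}{n_i} \sum_{j \in \mathcal{N}_i \setminus \{i\}} \frac{f_j}{\sum_{k \in \mathcal{N}_i} f_k} \right) F(w_i^t). \tag{13} $$

For the IM updating rule, we have 

$$ w_i^{t+1} = \sum_{j \in \mathcal{N}_i} \frac{f_j}{\sum_{k \in \mathcal{N}_i} f_k} F(w_j^t). \tag{14} $$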

The performance of an adaptive filtering algorithm is usually 
evaluated by two measures: mean-square deviation (MSD) and 
excess mean-square error (EMSE), which are defined as 

$$ \mathrm{MSD} = E\|w_i^t - w^o\|^2, \tag{15} $$
$$ \mathrm{EMSE} = E|u_i^t (w_i^{t-1} - w^o)|^2. \tag{16} $$

Using (12), (13) and (14), we can calculate the network MSD 
and EMSE of these three update rules according to [6]. 

C. Relationship to Existing Distributed Adaptive Filtering 
Algorithms 

In Section II, we have summarized the existing distributed 
adaptive filtering algorithms in (5) and Table I. In this subsection, 
we will show that all these algorithms are special 
cases of the IM update rule in our proposed graphical EGT 
framework. Comparing (5) with (14), we can see that different 
fitness definitions correspond to different distributed 
adaptive filtering algorithms in Table I. For the Uniform rule, 
the fitness can be uniformly defined as $f_i = 1$ and, using the 
IM update rule, we have 

$$ w_i^{t+1} = \sum_{j \in \mathcal{N}_i} \frac{1}{n_i} F(w_j^t), \tag{17} $$

which is equivalent to the uniform algorithm. Here, the definition 
$f_i = 1$ means the adoption of fixed fitness and weak 
selection ($\alpha \ll 1$). For the Laplacian rule, when updating the 
parameter of node $i$, the fitness of nodes in $\mathcal{N}_i$ can be defined 
as 

$$ f_j = \begin{cases} 1, & \text{for } j \neq i, \\ n_{\max} - n_i + 1, & \text{for } j = i. \end{cases} \tag{18} $$

From (18), we can see that each node gives more weight to 
the information from itself by enhancing its own fitness. 
Similarly, for the Relative degree-variance rule, the fitness can 
be defined as 

$$ f_j = n_j \sigma_j^{-2}, \quad \text{for all } j \in \mathcal{N}_i. \tag{19} $$

Table II summarizes the different fitness definitions corresponding 
to the different combination rules in Table I. 

TABLE II 
Different Fitness Definition. 

Name                            Fitness $f_j$ 
Uniform [11], [13]              $1$, for all $j \in \mathcal{N}_i$ 
Maximum degree [8], [14]        $1$ for $j \neq i$; $N - n_i + 1$ for $j = i$ 
Laplacian [15], [16]            $1$ for $j \neq i$; $n_{\max} - n_i + 1$ for $j = i$ 
Relative degree [8]             $n_j$, for all $j \in \mathcal{N}_i$ 
Relative degree-variance [6]    $n_j \sigma_j^{-2}$, for all $j \in \mathcal{N}_i$ 
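
The correspondence between fitness definitions and combination weights can be checked numerically. The sketch below normalizes fitness over a neighborhood, as the IM rule does, and applies it to the Laplacian fitness (own fitness $n_{\max} - n_i + 1$, neighbors' fitness 1). The helper names and the small example graph are hypothetical.

```python
def im_weights(fitness):
    """IM rule: A_i(j) = f_j / sum_k f_k over the neighborhood of node i."""
    total = sum(fitness.values())
    return {j: f / total for j, f in fitness.items()}

def laplacian_fitness(i, neighborhood, degree, n_max):
    """Laplacian fitness: 1 for j != i and n_max - n_i + 1 for j = i.

    degree[i] counts node i itself, following the paper's convention.
    """
    return {j: (n_max - degree[i] + 1) if j == i else 1 for j in neighborhood}

# Node 0 has neighborhood {0, 1, 2}, so n_0 = 3; the largest degree is 4.
degree = {0: 3, 1: 2, 2: 4}
weights = im_weights(laplacian_fitness(0, [0, 1, 2], degree, n_max=4))
```

For this example the result is a weight of 1/4 on each neighbor and 1/2 on node 0 itself, i.e., $1/n_{\max}$ for $j \neq i$ and $1 - (n_i - 1)/n_{\max}$ for $j = i$, which is the usual form of the Laplacian combination rule.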

D. Error-aware Distributed Adaptive Filtering Algorithm 

As discussed in Section II, the existing distributed adaptive 
filtering algorithms rely either on prior knowledge of the 
network topology or on additional network 
statistics. None of them is robust to a dynamic network, 
where a node's location may change and the noise variance 
of each node may also vary with time. Considering these 
problems, we propose an error-aware distributed algorithm 
based on the intuition that nodes with low mean-square error 
(MSE) should be given more weight while nodes with high 
MSE should be given less weight. The instantaneous MSE of 
node $i$, denoted by $\beta_i$, can be calculated by 

$$ \beta_i = |d_i(t) - u_i^t w_i^{t-1}|^2, \tag{20} $$

where only local data $\{d_i(t), u_i^t\}$ are used. We assume that 
nodes can exchange their instantaneous MSE information with 
neighbors. The fitness of each node is defined as 

$$ f_i = \beta_i^{-\lambda}, \tag{21} $$

where $\lambda$ is a positive coefficient. In such a case, using the IM 
update rule, we have 

$$ w_i^{t+1} = \sum_{j \in \mathcal{N}_i} \frac{f_j}{\sum_{k \in \mathcal{N}_i} f_k} F(w_j^t). \tag{22} $$

From (22), we can see that the proposed algorithm does not 
directly depend on any network topology information. 
Moreover, it can also adapt to a dynamic environment when 
the noise variance of some nodes suddenly changes, since 
the weights will be immediately adjusted accordingly. In 
Section VI, we will verify the performance of the proposed 
algorithm through simulation. 

IV. Diffusion Analysis 

In a distributed adaptive filter network, there are nodes 
with good signals, i.e., lower noise variance $\sigma^2$, as well as 
nodes with poor signals. The principal objective of distributed 
adaptive filtering algorithms is to stimulate the diffusion of 
good signals to the whole network to enhance the network 
mean-square performance. In this section, we will use 
EGT to analyze such a dynamic diffusion process and derive 
the closed-form expression for the diffusion probability. 

In a graphical evolutionary game, the structured population 
consists of residents and mutants. An important concept is the 
fixation probability, which represents the probability that the 
mutant will eventually overtake the whole population [32]. 
Let us consider a local adaptive filter network as shown in 
Fig. 4, where the hollow points denote common nodes, i.e., 
nodes with common noise variance $\sigma_r$, and the solid points 
denote good nodes, i.e., nodes with a lower noise variance 
$\sigma_m$. Here, we adopt the binary signal model to better reveal 
the diffusion process of good signals. If we regard the common 
nodes as residents and the good nodes as mutants, the concept 
of fixation probability in EGT can be applied to analyze 
the diffusion of good signals in the network. According to 
the definition of fixation probability, we define the diffusion 
probability in a distributed filter network as the probability that 
a good signal can be adopted by all nodes to update parameters 
in the network. 

Fig. 4. Graphical evolutionary game model. 

A. Strategies and Payoff Matrix 

As shown in Fig. 4, for the node at the center, its neighbors 
include both common nodes and good nodes. When the center 
node updates its parameter $w_i$, it has the following two 
strategies: 

$$ \begin{cases} S_r, & \text{using information from common nodes}, \\ S_m, & \text{using information from good nodes}. \end{cases} \tag{23} $$

In such a case, we can define the payoff matrix as follows: 

$$ \begin{array}{c|cc} & S_r & S_m \\ \hline S_r & \pi^{-1}(\sigma_r, \sigma_r) & \pi^{-1}(\sigma_m, \sigma_r) \\ S_m & \pi^{-1}(\sigma_r, \sigma_m) & \pi^{-1}(\sigma_m, \sigma_m) \end{array} = \begin{pmatrix} u_1 & u_2 \\ u_3 & u_4 \end{pmatrix}, \tag{24} $$

where $\pi(x, y)$ represents the EMSE of a node with noise variance 
$x$ using information from a node with noise variance $y$. 
For example, $\pi(\sigma_r, \sigma_m)$ is the EMSE of a node with noise 
variance $\sigma_r$ adopting strategy $S_m$, i.e., updating its $w$ using 
information from a node with noise variance $\sigma_m$, which in turn 
adopts strategy $S_r$. The following Lemma 1 gives the ordering 
of the entries of the payoff matrix. 

Lemma 1: The payoff matrix defined in (24) has the following 
property: 

$$ u_1 < u_3 < u_2 < u_4. \tag{25} $$

Proof: The EMSE $\pi(x, y)$ is determined by the noise 
variances of both nodes, as well as the combination rule. 
According to [33], the optimal $\pi(x, y)$ can be calculated by 

7r(a;, y) = ciCTi +C2^, (26) 

^2 



ci 



_ MTr(H„) 
~ 4 ' 



(27) 



where $\zeta = \mathrm{col}\{\zeta_1, \ldots, \zeta_M\}$ consists of the eigenvalues of $R_u$ 
(recall that $\mu$ is the step size and $R_u$ is the covariance matrix 
of the observed regression data $u^t$). According to (26) and (27), we have 



(28) 
(29) 





= Cidl + 2C2, 






7r(crr, CTm) 


1 2 1 2 


-2C2 






1 2 1 2 


-2C2 


2 


7r(crm, CTm) 


= cio-^ + 2c2. 







(30) 
(31) 

In such a case, we can see that $\pi(\sigma_r, \sigma_r) > \pi(\sigma_r, \sigma_m) > 
\pi(\sigma_m, \sigma_r) > \pi(\sigma_m, \sigma_m)$, which implies that $u_1 < u_3 < u_2 < 
u_4$. This completes the proof of the lemma. ■ 

In the following, we will analyze the diffusion process of 
strategy $S_m$, i.e., the ability of good signals to diffuse over 
the whole network. We consider an adaptive filter network 
based on a homogeneous graph with general degree $n$ and 
adopt the IM update rule for the parameter update [34]. Let 
$p_r$ and $p_m$ denote the percentages of nodes using strategies 
$S_r$ and $S_m$ in the population, respectively. Let $p_{rr}$, $p_{rm}$, $p_{mr}$ 
and $p_{mm}$ denote the percentages of edges, where $p_{rm}$ means 
the percentage of edges on which one node uses strategy $S_r$ 
and the other uses $S_m$. Let $q_{m|r}$ denote the conditional probability of a 
node using strategy $S_m$ given that the adjacent node is using 
strategy $S_r$; similarly, we have $q_{r|r}$, $q_{r|m}$ and $q_{m|m}$. In such a 
case, we have 

$$ p_r + p_m = 1, \tag{32} $$
$$ q_{r|X} + q_{m|X} = 1, \tag{33} $$
$$ p_{XY} = p_Y \cdot q_{X|Y}, \tag{34} $$
$$ p_{rm} = p_{mr}, \tag{35} $$

where $X$ and $Y$ are either $r$ or $m$. Equations (32)-(35) 
imply that the state of the whole network can be described by 
only two variables, $p_m$ and $q_{m|m}$. In the following, we will 
calculate the dynamics of $p_m$ and $q_{m|m}$ under the IM update 
rule. 
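
To see why two variables suffice, the remaining node and edge quantities can be computed from $p_m$ and $q_{m|m}$ using only the normalization and symmetry relations above. The following sketch does this; the function name and the returned dictionary keys are hypothetical, and it assumes $0 < p_m < 1$.

```python
def pair_state(p_m, q_mm):
    """Recover all node and edge quantities from (p_m, q_{m|m}).

    Uses p_r + p_m = 1, q_{r|X} + q_{m|X} = 1, p_XY = p_Y * q_{X|Y}
    and p_rm = p_mr. Assumes 0 < p_m < 1.
    """
    p_r = 1.0 - p_m
    q_rm = 1.0 - q_mm            # q_{r|m} + q_{m|m} = 1
    p_mm = p_m * q_mm            # p_mm = p_m * q_{m|m}
    p_rm = p_m * q_rm            # p_rm = p_m * q_{r|m}, equal to p_mr
    q_mr = p_rm / p_r            # from p_mr = p_r * q_{m|r}
    q_rr = 1.0 - q_mr            # q_{r|r} + q_{m|r} = 1
    p_rr = p_r * q_rr
    return {"p_r": p_r, "p_rr": p_rr, "p_rm": p_rm, "p_mm": p_mm,
            "q_rr": q_rr, "q_mr": q_mr, "q_rm": q_rm}

state = pair_state(p_m=0.25, q_mm=0.4)
```

In this example the edge percentages come out as $p_{rr} = 0.6$, $p_{rm} = p_{mr} = 0.15$ and $p_{mm} = 0.1$, which sum to one as they should.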

B. Dynamics of $p_m$ and $q_{m|m}$ 

According to the IM update rule, a node using strategy $S_r$ 
is selected for imitation with probability $p_r$. As shown in the 
left part of Fig. 4, among its $n$ neighbors (not including itself), 
there are $n_r$ nodes using strategy $S_r$ and $n_m$ nodes using 
strategy $S_m$, respectively, where $n_r + n_m = n$. The percentage 
of such a configuration is $\binom{n}{n_m} q_{r|r}^{n_r} q_{m|r}^{n_m}$. In such a case, the 
fitness of this node is 

$$ f_0 = (1 - \alpha) + \alpha (n_r u_1 + n_m u_2), \tag{36} $$

where the baseline fitness is normalized as 1. Among those $n$ 
neighbors, the fitness of a node using strategy $S_m$ is 

$$ f_m = (1 - \alpha) + \alpha \left( [(n-1) q_{r|m} + 1] u_3 + (n-1) q_{m|m} u_4 \right), \tag{37} $$

and the fitness of a node using strategy $S_r$ is 

$$ f_r = (1 - \alpha) + \alpha \left( [(n-1) q_{r|r} + 1] u_1 + (n-1) q_{m|r} u_2 \right). \tag{38} $$

In such a case, the probability that the node using strategy $S_r$ 
is replaced by $S_m$ is 

$$ P_{r \to m} = \frac{n_m f_m}{n_m f_m + n_r f_r + f_0}. \tag{39} $$

Therefore, the percentage of nodes using strategy $S_m$, $p_m$, 
increases by $1/N$ with probability 

$$ \mathrm{Prob}\left( \Delta p_m = \frac{1}{N} \right) = p_r \sum_{n_r + n_m = n} \binom{n}{n_m} q_{r|r}^{n_r} q_{m|r}^{n_m} \frac{n_m f_m}{n_m f_m + n_r f_r + f_0}. \tag{40} $$

Meanwhile, the edges on which both nodes use strategy $S_m$ increase 
by $n_m$; thus, we have 

$$ \mathrm{Prob}\left( \Delta p_{mm} = \frac{2 n_m}{nN} \right) = p_r \binom{n}{n_m} q_{r|r}^{n_r} q_{m|r}^{n_m} \frac{n_m f_m}{n_m f_m + n_r f_r + f_0}. \tag{41} $$
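
The probability that $p_m$ increases by $1/N$ can be evaluated directly as a binomial sum over neighbor configurations. The sketch below is one way to do so, assuming the fitness expressions read as above (baseline fitness 1, a focal $S_r$ node, and neighbor fitness values $f_m$ and $f_r$); the function name and argument order are hypothetical.

```python
from math import comb

def prob_increase(n, alpha, p_r, q_rr, q_mr, q_rm, q_mm, u):
    """Probability that p_m grows by 1/N in one IM update.

    u = (u1, u2, u3, u4) are the payoff matrix entries. The sum runs over
    the number n_m of S_m neighbors of a focal node that uses S_r.
    """
    u1, u2, u3, u4 = u
    f_m = (1 - alpha) + alpha * (((n - 1) * q_rm + 1) * u3 + (n - 1) * q_mm * u4)
    f_r = (1 - alpha) + alpha * (((n - 1) * q_rr + 1) * u1 + (n - 1) * q_mr * u2)
    total = 0.0
    for n_m in range(n + 1):
        n_r = n - n_m
        f_0 = (1 - alpha) + alpha * (n_r * u1 + n_m * u2)   # focal node's fitness
        config = comb(n, n_m) * q_rr ** n_r * q_mr ** n_m   # share of this configuration
        total += config * n_m * f_m / (n_m * f_m + n_r * f_r + f_0)
    return p_r * total

# Neutral case (alpha = 0): every fitness equals 1, so the sum reduces to
# p_r * n * q_{m|r} / (n + 1).
neutral = prob_increase(4, 0.0, 0.75, 0.8, 0.2, 0.6, 0.4, (1.0, 3.0, 2.0, 4.0))
```

With $\alpha = 0$ the payoffs drop out and the result depends only on the mean number of $S_m$ neighbors, which is a convenient sanity check before studying $\alpha > 0$.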



Similar analysis can be applied to a node using strategy 
$S_m$. According to the IM update rule, a node using strategy 
$S_m$ is selected for imitation with probability $p_m$. As shown 
in the right part of Fig. 4, we also assume that there are $n_r$ 
nodes using strategy $S_r$ and $n_m$ nodes using strategy $S_m$ 
among its $n$ neighbors. The percentage of such a configuration 
is $\binom{n}{n_m} q_{r|m}^{n_r} q_{m|m}^{n_m}$. Thus, the fitness of this node is 

$$ g_0 = (1 - \alpha) + \alpha (n_r u_3 + n_m u_4). \tag{42} $$

Among those $n$ neighbors, the fitness of a node using strategy 
$S_m$ is 

$$ g_m = (1 - \alpha) + \alpha \left( (n-1) q_{r|m} u_3 + [(n-1) q_{m|m} + 1] u_4 \right), \tag{43} $$

and the fitness of a node using strategy $S_r$ is 

$$ g_r = (1 - \alpha) + \alpha \left( (n-1) q_{r|r} u_1 + [(n-1) q_{m|r} + 1] u_2 \right). \tag{44} $$

In such a case, the probability that the node using strategy $S_m$ 
is replaced by $S_r$ is 

$$ P_{m \to r} = \frac{n_r g_r}{n_m g_m + n_r g_r + g_0}. \tag{45} $$

Therefore, the percentage of nodes using strategy $S_m$, $p_m$, 
decreases by $1/N$ with probability 

$$ \mathrm{Prob}\left( \Delta p_m = -\frac{1}{N} \right) = p_m \sum_{n_r + n_m = n} \binom{n}{n_m} q_{r|m}^{n_r} q_{m|m}^{n_m} \frac{n_r g_r}{n_m g_m + n_r g_r + g_0}. \tag{46} $$

Meanwhile, the edges on which both nodes use strategy $S_m$ 
decrease by $n_m$; thus, we have 

$$ \mathrm{Prob}\left( \Delta p_{mm} = -\frac{2 n_m}{nN} \right) = p_m \binom{n}{n_m} q_{r|m}^{n_r} q_{m|m}^{n_m} \frac{n_r g_r}{n_m g_m + n_r g_r + g_0}. \tag{47} $$

Combining (40) and (46), we have the dynamics of $p_m$ as 

$$ \dot{p}_m = \frac{1}{N} \mathrm{Prob}\left( \Delta p_m = \frac{1}{N} \right) - \frac{1}{N} \mathrm{Prob}\left( \Delta p_m = -\frac{1}{N} \right) = \frac{\alpha n (n-1) p_{rm}}{N (n+1)^2} (\gamma_1 u_1 + \gamma_2 u_2 + \gamma_3 u_3 + \gamma_4 u_4) + O(\alpha^2), \tag{48} $$

where the second equality follows from Taylor's theorem 
and the weak selection assumption with $\alpha$ going to zero [35]. 
In such a case, the payoff obtained from the interactions 
makes only a limited contribution to the overall fitness of each 
player. On one hand, the results derived from weak selection 
often remain valid approximations for larger selection 
strength [30]. On the other hand, the weak selection limit has a 
long tradition in theoretical biology [36]. Moreover, the weak 
selection assumption helps achieve a closed-form analysis 
of the diffusion process and better reveals how the strategy diffuses 
over the network. The parameters $\gamma_1$, $\gamma_2$, $\gamma_3$ and $\gamma_4$ in (48) 
are given as follows: 

$$ \gamma_1 = -q_{r|r} [(n-1)(q_{r|r} + q_{m|m}) + 3], \tag{49} $$
$$ \gamma_2 = -q_{m|m} - q_{m|r} [(n-1)(q_{r|r} + q_{m|m}) + 2] - \frac{2}{n-1}, \tag{50} $$
$$ \gamma_3 = q_{r|r} + q_{r|m} [(n-1)(q_{r|r} + q_{m|m}) + 2] + \frac{2}{n-1}, \tag{51} $$
$$ \gamma_4 = q_{m|m} [(n-1)(q_{r|r} + q_{m|m}) + 3]. \tag{52} $$
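
The leading term of the dynamics of $p_m$ is easy to evaluate once the four coefficients are known. The sketch below assumes the coefficients read as $\gamma_1 = -q_{r|r}[(n-1)(q_{r|r}+q_{m|m})+3]$, $\gamma_2 = -q_{m|m} - q_{m|r}[(n-1)(q_{r|r}+q_{m|m})+2] - 2/(n-1)$, $\gamma_3 = q_{r|r} + q_{r|m}[(n-1)(q_{r|r}+q_{m|m})+2] + 2/(n-1)$ and $\gamma_4 = q_{m|m}[(n-1)(q_{r|r}+q_{m|m})+3]$, with prefactor $\alpha n (n-1) p_{rm} / (N (n+1)^2)$. Function names are hypothetical.

```python
def gammas(n, q_rr, q_mr, q_rm, q_mm):
    """Coefficients gamma_1..gamma_4 multiplying u_1..u_4 in the drift of p_m."""
    s = (n - 1) * (q_rr + q_mm)
    g1 = -q_rr * (s + 3)
    g2 = -q_mm - q_mr * (s + 2) - 2 / (n - 1)
    g3 = q_rr + q_rm * (s + 2) + 2 / (n - 1)
    g4 = q_mm * (s + 3)
    return g1, g2, g3, g4

def drift_pm(alpha, n, N, p_rm, gam, u):
    """Leading-order drift of p_m, dropping the O(alpha^2) remainder."""
    scale = alpha * n * (n - 1) * p_rm / (N * (n + 1) ** 2)
    return scale * sum(g * uk for g, uk in zip(gam, u))

gam = gammas(4, q_rr=0.8, q_mr=0.2, q_rm=0.6, q_mm=0.4)
neutral_drift = drift_pm(0.01, 4, 100, 0.15, gam, (1.0, 1.0, 1.0, 1.0))
good_drift = drift_pm(0.01, 4, 100, 0.15, gam, (1.0, 3.0, 2.0, 4.0))
```

Whenever $q_{r|X} + q_{m|X} = 1$ holds, the four coefficients sum to zero, so equal payoffs give no drift. In the second call the payoffs satisfy $u_1 < u_3 < u_2 < u_4$ and the drift is positive for this particular state.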
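Similarly, by combining (41) and (47), we have the dynamics 
of $p_{mm}$ as 

$$ \dot{p}_{mm} = \sum_{n_m=0}^{n} \frac{2 n_m}{nN} \mathrm{Prob}\left( \Delta p_{mm} = \frac{2 n_m}{nN} \right) - \sum_{n_m=0}^{n} \frac{2 n_m}{nN} \mathrm{Prob}\left( \Delta p_{mm} = -\frac{2 n_m}{nN} \right) = \frac{2 p_{rm}}{(n+1) N} \left( 1 + (n-1)(q_{m|r} - q_{m|m}) \right) + O(\alpha). \tag{53} $$

Besides, we can also obtain the dynamics of $q_{m|m}$ as 

$$ \dot{q}_{m|m} = \frac{d}{dt} \left( \frac{p_{mm}}{p_m} \right) = \frac{2}{(n+1) N} \frac{p_{rm}}{p_m} \left( 1 + (n-1)(q_{m|r} - q_{m|m}) \right) + O(\alpha). \tag{54} $$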

C. Diffusion Probability Analysis 

The dynamic equation of $p_m$ in (48) reflects the dynamics of nodes updating $w$ using information from good nodes, i.e., the diffusion status of good signals in the network. A positive $\dot{p}_m$ means that good signals are diffusing over the network, while a negative $\dot{p}_m$ means that good signals have not been well adopted. The diffusion probability of good signals is closely related to the noise variance of good nodes $\sigma_m^2$. Intuitively, the lower $\sigma_m^2$, the higher the probability that good signals spread over the whole network. In this subsection, we derive the closed-form expression for the diffusion probability.

As discussed at the beginning of Section IV, the state of the whole network can be described by only $p_m$ and $q_{m|m}$. In such a case, (48) and (54) can be rewritten as functions of $p_m$ and $q_{m|m}$:

$$\dot{p}_m = \alpha \cdot G_1(p_m, q_{m|m}) + O(\alpha^2), \tag{55}$$

$$\dot{q}_{m|m} = G_2(p_m, q_{m|m}) + O(\alpha). \tag{56}$$

From (55) and (56), we can see that $q_{m|m}$ converges to equilibrium at a much faster rate than $p_m$ under the assumption of weak selection. At the steady state of $q_{m|m}$, i.e., $\dot{q}_{m|m} = 0$, we have

$$q_{m|m} - q_{m|r} = \frac{1}{n-1}. \tag{57}$$

In such a case, the dynamic network will rapidly converge onto the slow manifold, defined by $G_2(p_m, q_{m|m}) = 0$. Therefore, we can assume that (57) holds during the whole convergence process of $p_m$. According to (32)-(35) and (57), we have

$$q_{m|m} = p_m + \frac{1}{n-1}(1 - p_m), \tag{58}$$

$$q_{m|r} = \frac{n-2}{n-1}\, p_m, \tag{59}$$

$$q_{r|m} = \frac{n-2}{n-1}(1 - p_m), \tag{60}$$

$$q_{r|r} = 1 - \frac{n-2}{n-1}\, p_m. \tag{61}$$
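As a quick consistency check on the slow-manifold relations, the short sketch below evaluates the four conditional probabilities for a given $p_m$ and degree $n$ and confirms that they are normalized and satisfy the steady-state condition (57). The function name `pair_probabilities` and the sample values are illustrative assumptions, not part of the original analysis.

```python
from fractions import Fraction


def pair_probabilities(p_m, n):
    """Evaluate q_{m|m}, q_{m|r}, q_{r|m}, q_{r|r} as in (58)-(61).

    p_m is the fraction of nodes using S_m and n is the graph degree (n > 2).
    Exact rational arithmetic is used so the identities hold without rounding.
    """
    p_m = Fraction(p_m)
    ratio = Fraction(n - 2, n - 1)
    q_mm = p_m + (1 - p_m) / (n - 1)   # (58)
    q_mr = ratio * p_m                 # (59)
    q_rm = ratio * (1 - p_m)           # (60)
    q_rr = 1 - ratio * p_m             # (61)
    return q_mm, q_mr, q_rm, q_rr


# Hypothetical sample point: p_m = 1/5 on a degree-4 regular graph.
q_mm, q_mr, q_rm, q_rr = pair_probabilities(Fraction(1, 5), 4)
print(q_mm, q_mr, q_rm, q_rr)
```

The neighbor of a node using $S_m$ must use either $S_m$ or $S_r$, so $q_{m|m} + q_{r|m} = 1$, and likewise $q_{m|r} + q_{r|r} = 1$; both hold for any $p_m$ under these expressions.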



Therefore, the diffusion process can be characterized by $p_m$ alone. Thus, we can focus on the dynamics of $p_m$ to derive the diffusion probability, which is given by the following Theorem 1.



Theorem 1: In a distributed adaptive filter network which can be characterized by an $N$-node regular graph with degree $n$, suppose there are common nodes with noise variance $\sigma_r^2$ and good nodes with noise variance $\sigma_m^2$, where each common node is connected with only one good node. If each node updates its parameter $w$ using the IM update rule, the diffusion probability of the good signal is

$$P_{\text{diff}} = \frac{1}{n+1} + \frac{\alpha n N}{6(n+1)^3}\left(\xi_1 u_1 + \xi_2 u_2 + \xi_3 u_3 + \xi_4 u_4\right), \tag{62}$$

where the parameters $\xi_1$, $\xi_2$, $\xi_3$ and $\xi_4$ are as follows:

$$\xi_1 = -2n^2 - 5n + 3, \tag{63}$$

$$\xi_2 = -n^2 - n - 3, \tag{64}$$

$$\xi_3 = 2n^2 + 2n - 3, \tag{65}$$

$$\xi_4 = n^2 + 4n + 3. \tag{66}$$


Proof: First, let us define $m(p_m)$ as the mean of the increment of $p_m$ per unit time, given as follows

$$m(p_m) = \frac{\dot{p}_m}{1/N} \approx \frac{\alpha n(n-2)}{(n-1)(n+1)^2}\, p_m(1-p_m)(a p_m + b), \tag{67}$$

where the second step is derived by substituting (58)-(61) into (48), and the parameters $a$ and $b$ are given as follows:

$$a = (n-2)(n+3)(u_1 - u_2 - u_3 + u_4), \tag{68}$$

$$b = -(n-1)(n+3)u_1 - 3u_2 + (n^2+n-3)u_3 + (n+3)u_4. \tag{69}$$
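The substitution step behind (67) can be checked mechanically. The sketch below, written under the assumption that $\gamma_1$ to $\gamma_4$ take the forms in (49)-(52) and the conditional probabilities take the forms in (58)-(61), verifies with exact arithmetic that $(n-1)\sum_i \gamma_i u_i = a p_m + b$. Function names and sample payoffs are illustrative only.

```python
from fractions import Fraction


def gammas(p_m, n):
    """gamma_1..gamma_4 of (49)-(52), evaluated on the slow manifold (58)-(61)."""
    p_m = Fraction(p_m)
    ratio = Fraction(n - 2, n - 1)
    q_mm = p_m + (1 - p_m) / (n - 1)
    q_mr = ratio * p_m
    q_rm = ratio * (1 - p_m)
    q_rr = 1 - ratio * p_m
    s = (n - 1) * (q_rr + q_mm)          # equals n on the slow manifold
    g1 = -q_rr * (s + 3)
    g2 = -q_mm - q_mr * (s + 2) - Fraction(2, n - 1)
    g3 = q_rr + q_rm * (s + 2) + Fraction(2, n - 1)
    g4 = q_mm * (s + 3)
    return g1, g2, g3, g4


def a_and_b(u, n):
    """Parameters a and b of (68)-(69)."""
    u1, u2, u3, u4 = u
    a = (n - 2) * (n + 3) * (u1 - u2 - u3 + u4)
    b = -(n - 1) * (n + 3) * u1 - 3 * u2 + (n * n + n - 3) * u3 + (n + 3) * u4
    return a, b


def identity_holds(p_m, n, u):
    """Check (n - 1) * sum(gamma_i * u_i) == a * p_m + b exactly."""
    lhs = (n - 1) * sum(g * ui for g, ui in zip(gammas(p_m, n), u))
    a, b = a_and_b(u, n)
    return lhs == a * Fraction(p_m) + b


print(identity_holds(Fraction(3, 10), 4, (1, 3, 2, 5)))
```

Combined with $p_{rm} = p_r q_{m|r} = \frac{n-2}{n-1} p_m (1 - p_m)$, this identity turns the prefactor of (48) into the prefactor of (67).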

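We then define $v(p_m)$ as the variance of the increment of $p_m$ per unit time, which can be calculated by

$$v(p_m) = \frac{\ddot{p}_m - (\dot{p}_m)^2}{1/N}, \tag{70}$$

where $\ddot{p}_m$ can be computed by

$$\ddot{p}_m = \frac{1}{N^2}\left(\text{Prob}\left(\Delta p_m = \frac{1}{N}\right) + \text{Prob}\left(\Delta p_m = -\frac{1}{N}\right)\right) = \frac{2}{N^2}\frac{n(n-2)}{(n-1)(n+1)}\, p_m(1-p_m) + O(\alpha). \tag{71}$$

In such a case, $v(p_m)$ can be approximated by

$$v(p_m) \approx \frac{2}{N}\frac{n(n-2)}{(n-1)(n+1)}\, p_m(1-p_m). \tag{72}$$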

Suppose the initial percentage of good nodes in the network is $p_{m0}$. Let us define $H(p_{m0})$ as the probability that these good signals are finally adopted by the whole network, i.e., all nodes update their own $w$ using information from good nodes. According to the backward Kolmogorov equation [37], $H(p_{m0})$ satisfies the following differential equation

$$0 = m(p_{m0})\frac{dH(p_{m0})}{dp_{m0}} + \frac{v(p_{m0})}{2}\frac{d^2H(p_{m0})}{dp_{m0}^2}. \tag{73}$$

With the weak selection assumption, we can have the approximate solution of $H(p_{m0})$ as

$$H(p_{m0}) = p_{m0} + \frac{\alpha N}{6(n+1)}\, p_{m0}(1-p_{m0})\left((a+3b) + a p_{m0}\right). \tag{74}$$

Let us consider the worst initial system state, in which each common node is connected with only one good node, i.e., $p_{m0} = \frac{1}{n+1}$. We have

$$H\left(\frac{1}{n+1}\right) \approx \frac{1}{n+1} + \frac{\alpha n N}{6(n+1)^3}(a + 3b). \tag{75}$$

By substituting (68) and (69) into (75), we obtain the closed-form expression for the diffusion probability in (62). This completes the proof of the theorem. ■
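The step from (73) to (74) in the proof above is stated without intermediate working. One way to fill it in is sketched below, assuming the drift and variance take the forms $m(p) \approx \frac{\alpha n(n-2)}{(n-1)(n+1)^2} p(1-p)(ap+b)$ and $v(p) \approx \frac{2}{N}\frac{n(n-2)}{(n-1)(n+1)} p(1-p)$, and using the boundary conditions $H(0) = 0$ and $H(1) = 1$. The first-order correction $h(p)$ is a symbol introduced here for illustration only.

```latex
% Ratio of drift to half the variance:
\frac{m(p)}{v(p)/2} = \frac{\alpha N}{n+1}\,(ap + b).

% Hence (73) reads
H''(p) = -\frac{\alpha N}{n+1}\,(ap + b)\,H'(p), \qquad H(0) = 0,\; H(1) = 1.

% Weak selection: expand H(p) = p + \alpha h(p) + O(\alpha^2), with h(0) = h(1) = 0.
% Keeping terms of first order in \alpha gives
h''(p) = -\frac{N}{n+1}\,(ap + b).

% Integrating twice and applying h(0) = h(1) = 0:
h(p) = \frac{N}{6(n+1)}\left[(a + 3b)\,p - 3b\,p^2 - a\,p^3\right]
     = \frac{N}{6(n+1)}\,p(1-p)\left[(a + 3b) + a\,p\right].

% Therefore
H(p) \approx p + \frac{\alpha N}{6(n+1)}\,p(1-p)\left[(a + 3b) + a\,p\right].
```

The factorization in the last step can be confirmed by expanding $(1-p)\left[(a+3b) + ap\right] = (a+3b) - 3bp - ap^2$.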

Remark: From (74), we can see that two terms constitute the expression of the diffusion probability: the initial percentage of strategy $S_m$, $p_{m0}$ (the initial system state), and a second term representing the change of the system state after initialization, in which $a + 3b$ determines whether $p_m$ increases or decreases as the system updates. If $a + 3b < 0$, the diffusion probability is even lower than the initial percentage of strategy $S_m$, i.e., the information from good nodes is shrinking over the network instead of spreading. Therefore, $a + 3b > 0$ is more favorable for the improvement of the adaptive network performance.

Using Theorem 1, we can calculate the diffusion probability of the good signals over the network, which can be used to evaluate the performance of an adaptive filter network. Similarly, the diffusion dynamics and probabilities under the BD and DB update rules can also be derived using the same analysis. The following theorem shows an interesting result, which is based on an important theorem in [28] stating that evolutionary dynamics under BD, DB, and IM are equivalent for undirected regular graphs.

Theorem 2: In a distributed adaptive filter network which can be characterized by an $N$-node regular graph with degree $n$, suppose there are common nodes with noise variance $\sigma_r^2$ and good nodes with noise variance $\sigma_m^2$, where each common node is connected with only one good node. Then the diffusion probabilities of good signals under the BD and DB update rules are the same as that under the IM update rule.

V. Evolutionarily Stable Strategy

In the last section, we analyzed the information diffusion process in an adaptive network under the IM update rule and derived the diffusion probability of strategy $S_m$, i.e., using information from good nodes. A further question is: if the whole network has already chosen to adopt this favorable strategy $S_m$, is the current state a stable network state, even though a small fraction of nodes adopt the other strategy $S_r$? In the following, we answer this question using the concept of evolutionarily stable strategy (ESS) in evolutionary game theory. As discussed in Section III-A, the ESS ensures that one strategy is resistant against invasion by another strategy [38]. In our system model, it is obvious that $S_m$, i.e., using information from good nodes, is the favorable strategy and a desired ESS in the network. In this section, we check whether strategy $S_m$ is evolutionarily stable.

A. ESS in Complete Graphs 

We first discuss whether strategy $S_m$ is an ESS in complete graphs, which is shown by the following theorem.

Theorem 3: In a distributed adaptive filter network that can be characterized by a complete graph, strategy $S_m$ is always an ESS.

Proof: In a complete graph, each node meets every other node equally likely. In such a case, according to the payoff matrix in (24), the average payoffs of using strategies $S_r$ and $S_m$ are given by

$$U_r = p_r u_1 + p_m u_2, \tag{76}$$

$$U_m = p_r u_3 + p_m u_4, \tag{77}$$

where $p_r$ and $p_m$ are the percentages of the population using strategies $S_r$ and $S_m$, respectively. Consider the scenario in which the majority of the population adopts strategy $S_m$, while a small fraction of the population adopts $S_r$, which is considered as an invasion, $p_r = \epsilon$. In such a case, according to the definition of ESS in (9), strategy $S_m$ is evolutionarily stable if $U_m > U_r$ for $(p_r, p_m) = (\epsilon, 1 - \epsilon)$, i.e.,

$$\epsilon(u_3 - u_1) + (1 - \epsilon)(u_4 - u_2) > 0. \tag{78}$$

For $\epsilon \to 0$, the left-hand side of (78) is positive if and only if

$$\text{"}u_4 > u_2\text{"} \quad \text{or} \quad \text{"}u_4 = u_2 \text{ and } u_3 > u_1\text{"}. \tag{79}$$

Thus, (79) gives the sufficient condition for strategy $S_m$ to be evolutionarily stable. In our system, we have $u_4 > u_2 > u_3 > u_1$ according to Lemma 1, which means that (79) always holds. Therefore, strategy $S_m$ is always an ESS if the adaptive filter network is a complete graph. ■
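The condition used in this proof is easy to evaluate directly. The sketch below encodes (78) and (79) as two small functions; the payoff values in the example are placeholders that merely respect the ordering $u_4 > u_2 > u_3 > u_1$ quoted from Lemma 1, and the function names are illustrative.

```python
def payoff_gap(eps, u1, u2, u3, u4):
    """Left-hand side of (78): U_m - U_r at (p_r, p_m) = (eps, 1 - eps)."""
    return eps * (u3 - u1) + (1 - eps) * (u4 - u2)


def is_ess_complete_graph(u1, u2, u3, u4):
    """Condition (79) for S_m to resist invasion by S_r as eps -> 0."""
    return u4 > u2 or (u4 == u2 and u3 > u1)


# Hypothetical payoffs with u4 > u2 > u3 > u1.
u = (1.0, 3.0, 2.0, 5.0)
print(is_ess_complete_graph(*u), payoff_gap(0.01, *u))
```

When $u_4 = u_2$, the second term of (78) vanishes and the sign is decided by $u_3 - u_1$ alone, which is why (79) needs the second clause.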

B. ESS in Incomplete Graphs 

Let us consider an adaptive filter network which can be characterized by an incomplete regular graph with degree $n$. The following theorem shows that strategy $S_m$ is always an ESS in such an incomplete graph.

Theorem 4: In a distributed adaptive filter network which can be characterized by a regular graph with degree $n$, strategy $S_m$ is always an ESS.

Proof: Using the pair approximation method [31], the replicator dynamics of strategies $S_m$ and $S_r$ on a regular graph of degree $n$ can be approximated simply by

$$\dot{p}_r = p_r\left(p_r u'_1 + p_m u'_2 - \phi\right), \tag{80}$$

$$\dot{p}_m = p_m\left(p_r u'_3 + p_m u'_4 - \phi\right), \tag{81}$$

where $\phi = p_r p_r u'_1 + p_r p_m (u'_2 + u'_3) + p_m p_m u'_4$ is the average payoff, and $u'_1$, $u'_2$, $u'_3$ and $u'_4$ are given as follows:

$$u'_1 = u_1, \quad u'_2 = u_2 + u', \quad u'_3 = u_3 - u', \quad u'_4 = u_4. \tag{82}$$



The parameter $u'$ depends on the update rule (IM, BD or DB), and is given by [31]

$$\text{IM:} \quad u' = \frac{(n+3)u_1 + u_2 - u_3 - (n+3)u_4}{(n+3)(n-2)}, \tag{83}$$

$$\text{BD:} \quad u' = \frac{(n+1)u_1 + u_2 - u_3 - (n+1)u_4}{(n+1)(n-2)}, \tag{84}$$

$$\text{DB:} \quad u' = \frac{u_1 + u_2 - u_3 - u_4}{n-2}. \tag{85}$$

Fig. 5. Network information for simulation, including network topology for 20 nodes (left), trace of regressor covariance $\text{Tr}(R_u)$ (right top) and noise variance $\sigma_i^2$ (right bottom).

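In such a case, the equivalent payoff matrix is

$$\begin{array}{c} S_r \\ S_m \end{array} \begin{pmatrix} u_1 & u_2 + u' \\ u_3 - u' & u_4 \end{pmatrix}. \tag{86}$$

According to (79), the evolutionarily stable condition for strategy $S_m$ is

$$u_4 > u_2 + u'. \tag{87}$$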



With Lemma 1, we can see that $u' < 0$ for all three update rules. In such a case, (87) always holds, which means that strategy $S_m$ is always an ESS. This completes the proof of the theorem. ■
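The argument in this proof can be illustrated numerically. The sketch below computes $u'$ for the three update rules in the forms $\frac{(n+3)u_1 + u_2 - u_3 - (n+3)u_4}{(n+3)(n-2)}$ (IM), $\frac{(n+1)u_1 + u_2 - u_3 - (n+1)u_4}{(n+1)(n-2)}$ (BD) and $\frac{u_1 + u_2 - u_3 - u_4}{n-2}$ (DB), checks the stability condition $u_4 > u_2 + u'$, and integrates the replicator equation for $p_m$ with a plain Euler scheme. The payoffs, step size and horizon are hypothetical choices, and the Euler integration is an illustration rather than the method used in the paper.

```python
def u_prime(rule, n, u1, u2, u3, u4):
    """Payoff correction u' for the IM, BD and DB update rules (n > 2)."""
    if n <= 2:
        raise ValueError("the pair approximation needs degree n > 2")
    if rule == "IM":
        return ((n + 3) * u1 + u2 - u3 - (n + 3) * u4) / ((n + 3) * (n - 2))
    if rule == "BD":
        return ((n + 1) * u1 + u2 - u3 - (n + 1) * u4) / ((n + 1) * (n - 2))
    if rule == "DB":
        return (u1 + u2 - u3 - u4) / (n - 2)
    raise ValueError("rule must be 'IM', 'BD' or 'DB'")


def simulate_replicator(p_m0, rule, n, u, dt=0.01, steps=2000):
    """Euler integration of p_m' = p_m (p_r u3' + p_m u4' - phi)."""
    u1, u2, u3, u4 = u
    up = u_prime(rule, n, u1, u2, u3, u4)
    a1, a2, a3, a4 = u1, u2 + up, u3 - up, u4   # equivalent payoff matrix
    p_m = p_m0
    for _ in range(steps):
        p_r = 1.0 - p_m
        phi = p_r * p_r * a1 + p_r * p_m * (a2 + a3) + p_m * p_m * a4
        p_m += dt * p_m * (p_r * a3 + p_m * a4 - phi)
    return p_m


# Hypothetical payoffs with u4 > u2 > u3 > u1 on a degree-4 graph.
u = (1.0, 3.0, 2.0, 5.0)
for rule in ("IM", "BD", "DB"):
    up = u_prime(rule, 4, *u)
    print(rule, up, u[3] > u[1] + up)
print(simulate_replicator(0.95, "IM", 4, u))
```

Starting from a state in which 5% of nodes use $S_r$, the trajectory returns to $p_m = 1$ for these payoffs, which is the behavior the ESS property describes.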

VI. Simulation Results 

In this section, we develop simulations to compare the 
performances of different adaptive filtering algorithms, as well 
as to verify the derivation of information diffusion probability 
and the analysis of ESS. 

A. Mean-square Performances Comparison 

The network topology used for simulation is shown in the left part of Fig. 5, where 20 nodes are randomly located. The signal and noise power information of each node is shown in the right part of Fig. 5. In the simulation, we assume that the regressors, of size $M = 5$, are zero-mean Gaussian and independent in time and space. The unknown vector is set to be $w^o = \mathbf{1}_M/\sqrt{M}$ and the step size of the LMS algorithm at each node $i$ is set as $\mu_i = 0.01$. All the simulation results are averaged over 500 independent runs. All the performance comparisons are conducted among three different kinds of distributed LMS adaptive filtering algorithms as follows:

• Relative degree algorithm [8];

• Relative degree-variance algorithm [6];

• Proposed error-aware algorithm with $\lambda = 0.1$, 1 and 5.
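For orientation, the per-node building block of this setup can be sketched as a stand-alone LMS filter. This is not one of the three cooperative algorithms compared here: there is no neighbor combination step. It assumes unit-variance regressors, a noise variance of 0.01, and an unknown vector with all entries equal to $1/\sqrt{M}$; those three values are illustrative assumptions, since the per-node statistics used in the paper are given only graphically in Fig. 5.

```python
import math
import random


def run_lms(M=5, mu=0.01, noise_var=0.01, iters=5000, seed=0):
    """Single-node LMS: w <- w + mu * u * (d - u^T w), with d = u^T w_true + v."""
    rng = random.Random(seed)
    w_true = [1.0 / math.sqrt(M)] * M      # assumed unknown vector
    w = [0.0] * M                          # initial estimate
    noise_std = math.sqrt(noise_var)
    for _ in range(iters):
        # Zero-mean Gaussian regressor, independent over time (unit variance assumed).
        u = [rng.gauss(0.0, 1.0) for _ in range(M)]
        d = sum(ui * wi for ui, wi in zip(u, w_true)) + rng.gauss(0.0, noise_std)
        e = d - sum(ui * wi for ui, wi in zip(u, w))
        w = [wi + mu * ui * e for wi, ui in zip(w, u)]
    # Squared deviation of the final estimate from the true vector.
    msd = sum((a - b) ** 2 for a, b in zip(w_true, w))
    return w, msd


w, msd = run_lms()
print(10.0 * math.log10(msd))  # squared deviation in dB for this single run
```

In a diffusion network, each node would follow this adaptation step with a combination of its neighbors' estimates; the algorithms listed above differ in how those combination weights are chosen.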
Fig. 6 shows the transient network-performance comparison among the three kinds of algorithms in terms of EMSE and MSD. We can see that our proposed algorithm is always better than the relative degree algorithm, with a performance enhancement of up to 2.5 dB (i.e., the error of the relative degree algorithm is about 78% higher). Moreover, the larger $\lambda$ is set, the lower the EMSE and MSD that the proposed algorithm can achieve, but at the cost of a slower convergence rate. When $\lambda > 1$, our proposed algorithm performs better than the relative degree-variance algorithm, with a performance enhancement of about 0.5-1 dB (i.e., the error of the relative degree-variance algorithm is 12%-26% higher). As discussed in Section II, the relative degree-variance algorithm requires noise variance information of each node, while our proposed algorithm does not.
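The percentages quoted next to the dB figures follow from the usual power-ratio conversion, assuming EMSE and MSD in dB are $10\log_{10}$ of the linear values. Under that assumption, a gap of $g$ dB means the worse algorithm's error is $10^{g/10} - 1$ larger in relative terms, or equivalently the better algorithm's error is $1 - 10^{-g/10}$ smaller. The helper names below are illustrative.

```python
def db_to_ratio(gain_db):
    """Linear ratio of two squared-error quantities that differ by gain_db."""
    return 10.0 ** (gain_db / 10.0)


def excess_error_percent(gain_db):
    """How much larger the worse algorithm's error is, in percent."""
    return (db_to_ratio(gain_db) - 1.0) * 100.0


def error_reduction_percent(gain_db):
    """How much smaller the better algorithm's error is, in percent."""
    return (1.0 - 1.0 / db_to_ratio(gain_db)) * 100.0


for g in (0.5, 1.0, 2.5, 3.0):
    print(g, round(excess_error_percent(g), 1), round(error_reduction_percent(g), 1))
```

For example, a 2.5 dB gap corresponds to roughly 78% more error for the worse algorithm, which is the same as roughly 44% less error for the better one.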



Fig. 6. Transient performances comparison: (a) network EMSE; (b) network MSD.

Fig. 7. Steady performances comparison: (a) node's EMSE; (b) node's MSD.

Fig. 8. Steady performances comparison when the noise variances vary over time: (a) node's EMSE; (b) node's MSD.

Fig. 7 shows the steady-state performances of each node for the three kinds of distributed adaptive filtering algorithms in terms of EMSE and MSD. Since the steady-state result is for each node, besides averaging over 500 independent runs, we average at each node over 100 time slots after convergence. We can see that the comparison results of the steady-state performances are similar to those of the transient performances. Moreover, although there are distinct differences among the nodes' noise variances, as shown in Fig. 5(c), the steady-state EMSE and MSD of all nodes are similar to each other, which is principally due to the cooperative data sharing and good information diffusion.

To verify the robustness of our proposed algorithm, we also conduct simulations to compare the steady-state performances of the three algorithms when the noise variance of each node varies over time, as shown in Fig. 8. Based on the noise variances given in Fig. 5(c), we let the noise variance of each node vary randomly within [-50%, +50%] over the simulation time. We can see that under such circumstances, our proposed algorithm is always better than the relative degree and the relative degree-variance algorithms, with a performance enhancement of about 1-3 dB (i.e., the error of these algorithms is 26%-100% higher). Therefore, the simulation results verify that our proposed error-aware algorithm is resistant to the variation of nodes' noise variances.

B. Diffusion Probability 

In this subsection, we conduct simulations to verify the diffusion probability analysis in Section IV. For the simulation setup, three types of regular graphs are generated with degree $n = 3$, 4 and 6, respectively, as shown in Fig. 9(a). All three types of graphs have $N = 100$ nodes, where each node's trace of regressor covariance is set to be $\text{Tr}(R_u) = 10$, the common nodes' noise variance is set as $\sigma_r^2 = 1.5$ and the good nodes' noise variance is set as $\sigma_m^2 \in [0.2, 0.8]$. In the simulation, the network is initialized with the state that all common nodes choose strategy $S_r$. Then, at each time step, a randomly chosen node's strategy is updated according to the IM rule under weak selection (w = 0.01), as illustrated in Section III-B. The update steps are repeated until either strategy $S_m$ has reached fixation or the number of steps has reached the limit. The diffusion probability is calculated as the fraction of runs in which strategy $S_m$ reached fixation out of 10^ runs. Fig. 9(b) shows the simulation results, from which we can see that the simulated results largely agree with the corresponding theoretical results, and the gaps are due to the approximations made during the derivations. Moreover, we can see that the diffusion probability of the good signal decreases as its noise variance increases, i.e., a better signal has better diffusion capability.

Fig. 9. Diffusion probabilities under three types of regular graphs: (a) regular graph structures with degree $n = 3$, 4 and 6; (b) diffusion probability versus noise variance of good nodes $\sigma_m^2$.
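The simulation procedure described here can be sketched as a small Monte Carlo routine. The version below is a generic two-strategy game, not the paper's exact setup. It makes several assumptions for illustration: a circulant graph (each node linked to its $n/2$ nearest neighbors on each side, so $n$ must be even) rather than a random regular graph, a node payoff equal to the sum of its pairwise payoffs, fitness $1 - \alpha + \alpha \cdot \text{payoff}$, and hypothetical payoff values. Under the IM rule, the chosen node keeps its strategy or copies a neighbor with probability proportional to fitness.

```python
import random


def circulant_neighbors(N, n):
    """Neighbor lists of a degree-n circulant graph on N nodes (n even, n < N)."""
    half = n // 2
    return [[(i + k) % N for k in range(-half, half + 1) if k != 0] for i in range(N)]


def fitness(i, strat, nbrs, u, alpha):
    """Assumed fitness: 1 - alpha + alpha * (sum of payoffs against neighbors)."""
    payoff = sum(u[strat[i]][strat[j]] for j in nbrs[i])
    return 1.0 - alpha + alpha * payoff


def run_im(N, n, u, alpha, init_m, max_steps, rng):
    """One run of the IM rule; returns True if strategy S_m (coded 1) fixates."""
    nbrs = circulant_neighbors(N, n)
    strat = [0] * N                      # 0 = S_r, 1 = S_m
    for i in init_m:
        strat[i] = 1
    count_m = len(set(init_m))
    for _ in range(max_steps):
        if count_m == N:
            return True
        if count_m == 0:
            return False
        i = rng.randrange(N)             # node selected for imitation
        candidates = [i] + nbrs[i]       # itself and its neighbors
        weights = [fitness(j, strat, nbrs, u, alpha) for j in candidates]
        j = rng.choices(candidates, weights=weights)[0]
        if strat[j] != strat[i]:
            count_m += strat[j] - strat[i]
            strat[i] = strat[j]
    return count_m == N                  # step limit reached


def fixation_probability(N, n, u, alpha, init_m, runs, max_steps, seed=0):
    """Fraction of runs in which S_m reaches fixation."""
    rng = random.Random(seed)
    hits = sum(run_im(N, n, u, alpha, init_m, max_steps, rng) for _ in range(runs))
    return hits / runs


# Hypothetical payoffs u[s_i][s_j]: row = own strategy, column = neighbor's.
u = [[1.0, 3.0], [2.0, 5.0]]
print(fixation_probability(20, 4, u, 0.01, init_m=[0, 5, 10, 15],
                           runs=200, max_steps=200000, seed=1))
```

With $\alpha = 0$ all fitness values are equal, and on a regular graph the estimate should then fluctuate around the initial fraction of $S_m$ nodes, which is a useful sanity check before comparing against the first-order theoretical value.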

C. Evolutionarily Stable Strategy 

To verify that strategy $S_m$ is an ESS in the adaptive network, we further simulate the IM update rule on a $10 \times 10$ grid network with degree $n = 4$ and number of nodes $N = 100$, as shown in Fig. 10, where the hollow points represent common nodes and the solid points represent good nodes. In the simulation, all the settings are the same as those in the simulation of diffusion probability in Section VI-B, except the initial network setting. The initial network state is set such that the majority of nodes adopt strategy $S_m$, denoted in black (including both hollow and solid nodes) in Fig. 10, and only a very small percentage of nodes use strategy $S_r$, denoted in red. From the strategy updating process of the whole network illustrated in Fig. 10, we can see that the network finally abandons the unfavorable strategy $S_r$, which verifies the stability of strategy $S_m$.

VII. Conclusion 

In this paper, we proposed an evolutionary game theoretic framework to offer a very general view of the distributed adaptive filtering problems and unify existing algorithms. Based on this framework, we further designed an error-aware adaptive filtering algorithm. The simulation results showed that, compared with existing algorithms, our proposed algorithm can achieve better mean-square performance without knowledge of network statistical information. Using graphical evolutionary game theory, we further analyzed the information diffusion process in the network under the IM update rule, and proved that the strategy of using information from nodes with good signals is always an ESS. Finally, simulation results verified our analysis.